Combined Spectral Subtraction and Cepstral Normalisation for Robust Speech Recognition
Abstract
This paper presents an effective feature processing algorithm for robust speech recognition, based on combined spectral and cepstral processing. The spectral processing consists of Full-Wave Rectification Spectral Subtraction (FWR-SS) and Likelihood Controlled Instantaneous Noise Estimation (LCINE), while the cepstral processing is based on mean and variance normalisation. The combination is motivated by the fact that spectral subtraction, which usually operates on a single frame, introduces large statistical mismatches between clean and enhanced noisy speech in the cepstral domain, resulting in degraded recognition performance. The introduced cepstral processing is able, to some extent, to mitigate these mismatches, and in this sense the two methods are not just combined but shown to be complementary. Statistical analyses as well as recognition experiments are conducted on the Aurora 2 database, and a performance comparable to that of the much more complex ETSI advanced front-end is achieved.
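The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that full-wave rectification means taking the absolute value of the subtracted magnitude spectrum, it replaces LCINE with a fixed noise estimate, and it uses a plain log-spectrum DCT without a mel filterbank. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def fwr_spectral_subtraction(noisy_mag, noise_mag):
    """Subtract a noise magnitude estimate, then full-wave rectify.

    Assumption: full-wave rectification is the absolute value of the
    difference (negative results are mirrored rather than set to zero).
    """
    return np.abs(noisy_mag - noise_mag)

def log_cepstra(mag, n_ceps=13, floor=1e-10):
    """Simplified cepstra: DCT-II of the log magnitude spectrum.

    Assumption: no mel filterbank is applied, to keep the sketch short.
    """
    log_spec = np.log(np.maximum(mag, floor))
    n_bins = log_spec.shape[1]
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_bins)[None, :]
    basis = np.cos(np.pi * k * (n + 0.5) / n_bins)  # unnormalised DCT-II basis
    return log_spec @ basis.T

def cmvn(cepstra, eps=1e-8):
    """Per-utterance mean and variance normalisation of each coefficient."""
    mean = cepstra.mean(axis=0, keepdims=True)
    std = cepstra.std(axis=0, keepdims=True)
    return (cepstra - mean) / (std + eps)

# Synthetic magnitude spectra standing in for real speech (frames x bins).
rng = np.random.default_rng(0)
clean = np.abs(rng.normal(size=(200, 64))) + 1.0
noise = np.abs(rng.normal(scale=0.5, size=(200, 64)))
noisy = clean + noise

# Assumption: a fixed average noise spectrum stands in for LCINE.
noise_est = noise.mean(axis=0)

enhanced = fwr_spectral_subtraction(noisy, noise_est)
features = cmvn(log_cepstra(enhanced))
```

After normalisation each cepstral coefficient has approximately zero mean and unit variance over the utterance, which is the property the abstract relies on to reduce the mismatch left by frame-wise subtraction.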
Similar resources
Improving the performance of MFCC for Persian robust speech recognition
Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Forward masking on a generalized logarithmic scale for robust speech recognition
This paper examines forward masking on the generalized logarithmic scale for speech recognition that is robust to both additive and convolutional noise. The forward masking in the dynamic cepstral (DyC) representation is based upon subtraction of a masking pattern from a current spectrum in the logarithmic spectral domain, whereas the proposed method intends to make a compromise between the logarithm...
Spectral Normalisation MFCC Derived Features for Robust Speech Recognition
This paper presents a method for extracting MFCC parameters from a normalised power spectrum density. The underlying spectral normalisation method is based on the fact that speech regions with less energy need more robustness, since in these regions the noise is more dominant and thus the speech is more corrupted. Low-energy speech regions usually contain sounds of unvoiced nature where are i...
A New Data Driven Method for Robust Speech Recognition
The conventional view on the problem of robustness in speech recognition is that performance degradation in ASR systems is due to a mismatch between training and test conditions. If the problem of robustness in ASR systems is considered as a mismatch between the training and testing conditions, the solution is to find a way to reduce it. Common approaches are: Data-Driven methods such as speec...
Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm
We propose a blind dereverberation method based on spectral subtraction using a multi-channel least mean squares (MCLMS) algorithm for distant-talking speech recognition. In a distant-talking environment, the channel impulse response is longer than the short-term spectral analysis window. By treating the late reverberation as additive noise, a noise reduction technique based on spectral subtrac...
Publication date: 2005